FastFDs: A Heuristic-Driven, Depth-First Algorithm for Mining Functional Dependencies from Relation Instances - Extended Abstract
Abstract
Discovering functional dependencies (FDs) from an existing relation instance is an important technique in data mining and database design. To date, even the most efficient solutions are exponential in the number of attributes of the relation (n), even when the size of the output is not exponential in n. Lopes et al. developed an algorithm, Dep-Miner, that works well for large n on randomly-generated integer-valued relation instances [LPL 00a]. Dep-Miner first reduces the FD discovery problem to that of finding minimal covers for hypergraphs, then employs a levelwise search strategy to determine these minimal covers. Our algorithm, FastFDs, instead employs a depth-first, heuristic-driven search strategy for generating minimal covers of hypergraphs. This type of search is commonly used to solve search problems in Artificial Intelligence (AI) [RN 95]. Our experimental results indicate that the levelwise strategy that is the hallmark of many successful data mining algorithms is in fact significantly surpassed by the depth-first, heuristic-driven strategy FastFDs employs, due to the inherent space efficiency of the search. Furthermore, we revisit the comparison between Dep-Miner and Tane, including FastFDs. We report several tests on distinct benchmark relation instances, comparing the Dep-Miner and FastFDs hypergraph approaches to Tane's partitioning approach for mining FDs from a relation instance. At the end of the paper (Appendix A) we provide experimental data comparing FastFDs with a third algorithm, fdep [FS 99].
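The reduction and search strategy described in the abstract can be illustrated with a small sketch. The abstract does not spell out the construction, so the details here are assumptions: hyperedges are taken to be the attribute sets on which pairs of tuples disagree (restricted to pairs that disagree on the right-hand-side attribute), and the heuristic is assumed to be "try first the attribute that covers the most uncovered edges". The function names `difference_sets`, `minimal_covers`, and `mine_fds` are hypothetical, and this is not the authors' implementation.

```python
from itertools import combinations


def difference_sets(rows):
    """Attribute-index sets on which each pair of rows disagrees (assumed construction)."""
    n = len(rows[0])
    diffs = set()
    for r, s in combinations(rows, 2):
        d = frozenset(i for i in range(n) if r[i] != s[i])
        if d:
            diffs.add(d)
    return diffs


def minimal_covers(edges, attrs):
    """Depth-first, heuristic-ordered search for all minimal covers of a hypergraph."""
    edges = list(edges)
    found = set()

    def is_minimal(cover):
        # Each attribute in the cover must be the only cover member in some edge.
        return all(any(e & cover == {a} for e in edges) for a in cover)

    def dfs(cover, remaining, candidates):
        if not remaining:
            if is_minimal(cover):
                found.add(cover)
            return
        # Assumed heuristic: attributes covering the most uncovered edges go first;
        # attributes that cover nothing are dropped. Ties are broken by index.
        ranked = sorted(
            (a for a in candidates if any(a in e for e in remaining)),
            key=lambda a: (-sum(a in e for e in remaining), a),
        )
        for i, a in enumerate(ranked):
            # A child may only use attributes ranked after `a`, so no set is visited twice.
            dfs(cover | {a}, [e for e in remaining if a not in e], ranked[i + 1:])

    dfs(frozenset(), edges, sorted(attrs))
    return found


def mine_fds(rows):
    """Return minimal FDs as (lhs_attribute_indices, rhs_attribute_index) pairs."""
    n = len(rows[0])
    diffs = difference_sets(rows)
    fds = set()
    for a in range(n):
        # Edges for right-hand side `a`: difference sets containing `a`, with `a` removed.
        edges = {d - {a} for d in diffs if a in d}
        for lhs in minimal_covers(edges, set(range(n)) - {a}):
            fds.add((lhs, a))
    return fds


rows = [(1, "x", 10), (1, "y", 20), (2, "x", 10), (2, "y", 20)]
fds = mine_fds(rows)
```

On this toy relation, columns 1 and 2 determine each other and column 0 is determined by nothing, so `fds` holds exactly the two dependencies 2 → 1 and 1 → 2. The depth-first search keeps only the current path in memory, which is the space advantage over a levelwise search that the abstract points to.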
Similar resources
FastFDs: A Heuristic-Driven, Depth-First Algorithm for Mining Functional Dependencies from Relation Instances
Discovering functional dependencies (FDs) from an existing relation instance is an important technique in data mining and database design. To date, even the most efficient solutions are exponential in the number of attributes of the relation (n), even when the size of the output is not exponential in n. Lopes et al. developed an algorithm, Dep-Miner, that works well for large n on randomly-genera...
Ordering Depth First Search to Improve AFD Mining
This paper describes a new search algorithm, bottom-up attribute keyness depth-first search (BU-AKD), for mining powerset lattices with the use of a monotonic approximation measure; characteristics present in many problem domains. The research reported here focuses on one of these problem domains, the discovery of Approximate Functional Dependencies (AFDs). AFDs are measured versions of functio...
Resampling in an Indefinite Database to Approximate Functional Dependencies Research Note Rn/98/10
We reintroduce Numerical Dependencies (NDs), defined originally to enhance database design, within a data mining context where we use ND sets to approximate the satisfaction of a given Functional Dependency (FD) set within a relation. We motivate NDs by examining the use of indefinite information in relations. Indefinite information is represented within the relational model by allowing cells to c...
Heuristic and exact algorithms for Generalized Bin Covering Problem
In this paper, we study the Generalized Bin Covering problem. For this problem, an exact algorithm is introduced that can find an optimal solution for small-scale instances. To find a near-optimal solution for large-scale instances, a heuristic algorithm is proposed. The efficiency of the heuristic algorithm is assessed through computational experiments.
A Heuristic Algorithm for Nonlinear Lexicography Goal Programming with an Efficient Initial Solution
In this paper, a heuristic algorithm is proposed in order to solve a nonlinear lexicography goal programming (NLGP) by using an efficient initial point. Some numerical experiments showed that the search quality by the proposed heuristic in a multiple objectives problem depends on the initial point features, so in the proposed approach the initial point is retrieved by Data Envelopment Analysis...